Speaker Independent Phoneme Classification in Continuous Speech
Abstract
This paper examines statistical models for phoneme classification. We compare the performance of our phoneme classification system, which uses Gaussian mixture model (GMM) phoneme models, with systems that use hidden Markov model (HMM) phoneme models. Measurements show that the performance of our GMM-based system is comparable to that of HMM-based systems in context-independent phoneme classification.
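As a rough illustration of GMM-based phoneme classification, the sketch below scores a segment of feature frames against one diagonal-covariance GMM per phoneme and picks the phoneme with the highest total log-likelihood. It is not the system described in the abstract. The function names, the toy two-dimensional features, the hand-set mixture parameters, and the assumption that frames are scored independently are all hypothetical choices made for this example.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify_segment(frames, phoneme_gmms):
    """Return the phoneme whose GMM gives the highest summed frame log-likelihood.

    Summing over frames assumes the frames are independent given the phoneme,
    which is a simplification made for this sketch.
    """
    scores = {
        label: sum(gmm_log_likelihood(f, gmm) for f in frames)
        for label, gmm in phoneme_gmms.items()
    }
    return max(scores, key=scores.get)

# Toy, hand-set models (hypothetical values, not taken from the paper).
PHONEME_GMMS = {
    "a": [(0.6, [0.0, 0.0], [1.0, 1.0]), (0.4, [1.0, 1.0], [1.0, 1.0])],
    "s": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [1.0, 1.0])],
}

segment = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.3]]
print(classify_segment(segment, PHONEME_GMMS))
```

In practice the mixture weights, means, and variances would be estimated from labelled training frames, for example with the EM algorithm, rather than set by hand.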
Similar resources
Improvements in the Stochastic Segment Model for Phoneme Recognition
The heart of a speech recognition system is the acoustic model of sub-word units (e.g., phonemes). In this work we discuss refinements of the stochastic segment model, an alternative to hidden Markov models for representation of the acoustic variability of phonemes. We concentrate on mechanisms for better modelling time correlation of features across an entire segment. Results are presented for...
Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain
This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...
Speaker, Vocabulary and Context Independent Word Spotting System for Continuous Speech
Word spotting is a widely known subject in continuous speech recognition and the traditional approaches use either hidden Markov models (HMM) or Gaussian mixture models (GMM). In this paper, we propose a different approach based on language independent phoneme modeling. The proposed system is speaker and vocabulary independent, and it is easy to implement. An equal error rate (EER) of 3.34% and...
Beating Henry Higgins at His Own Game: A Markovian Approach to Dialectology
The performance of speech recognition algorithms degrades considerably due to speaker variability. Aside from gender, the largest cause of speaker variability is accent. If the accent of a speaker can be determined automatically, then accent-specific speech recognition models can be used, thereby increasing speech recognition accuracy. In this study, the problem of accent class...
An Experimental Real-Time Speech-to-Speech Translation System
This paper reports the current progress in the SPEECHTRANS project at the Center for Machine Translation, which is a speech-to-speech translation project for real-time processing of speaker-independent noisy continuous speech input. SPEECHTRANS uses custom speech recognition hardware and a phoneme-based generalized LR parser that uses a unification-based grammar formalism and a natural languag...
Publication date: 2004